22 research outputs found

    The Role of Preprocessing for Word Representation Learning in Affective Tasks

    Affective tasks, including sentiment analysis, emotion classification, and sarcasm detection, have drawn a lot of attention in recent years due to a broad range of useful applications in various domains. The main goal of affect detection tasks is to recognize states such as mood, sentiment, and emotions from textual data (e.g., news articles or product reviews). Despite the importance of utilizing preprocessing steps in different stages (i.e., word representation learning and building a classification model) of affect detection tasks, this topic has not been studied well. To that end, we explore whether applying various preprocessing methods (stemming, lemmatization, stopword removal, punctuation removal and so on) and their combinations in different stages of the affect detection pipeline can improve the model performance. There are many preprocessing approaches that can be utilized in affect detection tasks. However, their influence on the final performance depends on the type of preprocessing and the stages at which they are applied. Moreover, the impact of preprocessing varies across different affective tasks. Our analysis provides thorough insights into how preprocessing steps can be applied in building an affect detection pipeline and their respective influence on performance.
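    The preprocessing steps named in the abstract can be pictured as switches on a small text pipeline. The following is a minimal sketch, not the authors' code: the stopword list, the `strip_suffix` helper (a crude stand-in for a real stemmer), and the `preprocess` function are all hypothetical names chosen for illustration.

```python
import string

# Tiny illustrative stopword list (an assumption; real lists are much larger).
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in"}


def strip_suffix(token):
    """Crude stand-in for stemming: drop one common English suffix."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(text, remove_punct=True, remove_stopwords=True, stem=False):
    """Apply the selected preprocessing steps and return a list of tokens."""
    text = text.lower()
    if remove_punct:
        # Delete every ASCII punctuation character.
        text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    if stem:
        tokens = [strip_suffix(t) for t in tokens]
    return tokens


print(preprocess("The movie is amazing, really!"))
print(preprocess("The movie is amazing, really!", stem=True))
```

    Toggling each flag separately for the embedding-training corpus and for the classifier input would give the kind of stage-by-stage comparison the abstract describes.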

    Conversation Derailment Forecasting with Graph Convolutional Networks

    Online conversations are particularly susceptible to derailment, which can manifest itself in the form of toxic communication patterns like disrespectful comments or verbal abuse. Forecasting conversation derailment predicts signs of derailment in advance, enabling proactive moderation of conversations. Current state-of-the-art approaches to address this problem rely on sequence models that treat dialogues as text streams. We propose a novel model based on a graph convolutional neural network that considers dialogue user dynamics and the influence of public perception on conversation utterances. Through empirical evaluation, we show that our model effectively captures conversation dynamics and outperforms the state-of-the-art models on the CGA and CMV benchmark datasets by 1.5% and 1.7%, respectively. Comment: WOAH, AC

    Knowledge-aware Complementary Product Representation Learning

    Learning product representations that reflect complementary relationships plays a central role in e-commerce recommender systems. In the absence of the product relationship graph, which existing methods rely on, there is a need to detect the complementary relationships directly from noisy and sparse customer purchase activities. Furthermore, unlike simple relationships such as similarity, complementariness is asymmetric and non-transitive. Standard usage of representation learning emphasizes only one set of embeddings, which is problematic for modelling such properties of complementariness. We propose using knowledge-aware learning with dual product embeddings to solve the above challenges. We encode contextual knowledge into product representations by multi-task learning, to alleviate the sparsity issue. By explicitly modelling with user bias terms, we separate the noise of customer-specific preferences from the complementariness. Furthermore, we adopt the dual embedding framework to capture the intrinsic properties of complementariness and provide a geometric interpretation motivated by the classic separating hyperplane theory. Finally, we propose a Bayesian network structure that unifies all the components, and which also includes several popular models as special cases. The proposed method compares favourably to state-of-the-art methods in downstream classification and recommendation tasks. We also develop an implementation that scales efficiently to a dataset with millions of items and customers.
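    Why a single embedding per product struggles with asymmetry can be seen in a toy example: a dot product of one vector set is symmetric by construction, whereas two vector sets per product allow the score for "b complements a" to differ from the reverse. The sketch below is only an illustration under assumed values; the `source` and `target` tables, their numbers, and `complement_score` are hypothetical and not taken from the paper.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical 2-d embeddings: every product gets a "source" vector
# (used when it is the purchased item) and a "target" vector
# (used when it is the candidate complement).
source = {"phone": [1.0, 0.0], "case": [0.0, 0.2]}
target = {"phone": [0.1, 0.0], "case": [0.9, 0.1]}


def complement_score(a, b):
    """Score for 'b complements a': source vector of a against target vector of b."""
    return dot(source[a], target[b])


# A case complements a phone far more than a phone complements a case.
print(complement_score("phone", "case"))
print(complement_score("case", "phone"))
```

    With a single embedding table the two scores would always be equal, which is the limitation the abstract points to.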

    Qualitative Analysis of User-based and Item-based Prediction Algorithms for Recommendation Systems, CIA 2004

    Abstract. Recommendation agents employ prediction algorithms to provide users with items that match their interests. In this paper, we describe and evaluate several prediction algorithms, some of which are novel in that they combine user-based and item-based similarity measures derived from either explicit or implicit ratings. We compare both statistical and decision-support accuracy metrics of the algorithms against different levels of data sparsity and different operational thresholds. The first metric evaluates the accuracy in terms of average absolute deviation, while the second evaluates how effectively predictions help users to select high-quality items. Our experimental results indicate better performance of item-based predictions derived from explicit ratings in relation to both metrics. Category-boosted predictions can lead to slightly better predictions when combined with explicit ratings, while implicit ratings (in the sense that we have defined them here) perform much worse than explicit ratings.
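    The two kinds of metric mentioned above can be sketched briefly. The first function computes the average absolute deviation between predicted and actual ratings. The second is one plausible reading of a decision-support metric, assumed here for illustration: the fraction of items on which prediction and actual rating agree about being at or above an operational threshold. The function names and the example ratings are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute deviation between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def decision_agreement(predicted, actual, threshold):
    """Fraction of items where prediction and truth fall on the same side
    of the threshold (an assumed decision-support measure)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    agree = sum((p >= threshold) == (a >= threshold)
                for p, a in zip(predicted, actual))
    return agree / len(predicted)


predicted = [4, 3, 5]
actual = [5, 3, 3]
print(mean_absolute_error(predicted, actual))
print(decision_agreement(predicted, actual, threshold=4))
```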

    Solving the Sparsity and Scalability Problems of Recommendation Algorithms

    The World Wide Web has emerged during the last decade as one of the most prominent research fields. However, its size, heterogeneity and complexity to a large extent exceed our ability to manipulate data efficiently using traditional techniques. In order to cope with these characteristics, several Web applications require intelligent tools that help extract the information relevant to users' requests. In this thesis we report on the algorithmic aspects of recommendation technologies, which refer to algorithms and systems that have been developed to help users find items that may be of interest to them among a variety of available items. Collaborative Filtering (CF), the prevalent method for providing recommendations, has been successfully adopted by research and industrial applications. However, its applicability is limited by the sparsity and scalability problems. Sparsity refers to a situation in which transactional data are lacking or insufficient, while scalability refers to the expensive computations required by CF. To address the scalability problem we propose a method of Incremental CF (ICF) that is based on incremental updates of user-to-user similarities. Our ICF algorithm (i) is not based on any approximation method, and thus retains the potential for high-quality recommendations, and (ii) provides recommendations orders of magnitude faster than classic CF, and is thus suitable for online applications. To provide high-quality recommendations even when data are sparse, we propose a method for alleviating sparsity using trust inferences. Trust inferences are transitive associations between users in the context of an underlying social network and are valuable sources of additional information that help deal with the sparsity and cold-start problems. Our experimental evaluation indicates that our method of trust inferences significantly improves the quality of the recommendations of the classic CF method.
Finally, we provide a roadmap for future research directions that extend recommendation technologies to more complex types of applications and identify various research opportunities for developing them.
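    The idea of updating user-to-user similarities incrementally, rather than recomputing them from scratch, can be illustrated with a small sketch. The abstract does not say which similarity measure the thesis uses, so cosine similarity over all of a user's ratings is assumed here, and the `IncrementalCosine` class is a hypothetical illustration rather than the ICF algorithm itself. It caches per-pair dot products and per-user squared norms, so a new rating touches only the terms it affects and the result stays exact.

```python
import math
from collections import defaultdict


class IncrementalCosine:
    """Exact cosine similarity between users, maintained from cached sums."""

    def __init__(self):
        self.ratings = defaultdict(dict)   # user -> {item: rating}
        self.dot = defaultdict(float)      # frozenset({u, v}) -> sum of r_u * r_v
        self.sq_norm = defaultdict(float)  # user -> sum of squared ratings

    def add_rating(self, user, item, rating):
        """Register a new rating and update only the affected cached terms."""
        if item in self.ratings[user]:
            raise ValueError("rating updates are not handled in this sketch")
        # Only users who already rated this item share a new product term.
        for other, items in self.ratings.items():
            if other != user and item in items:
                self.dot[frozenset((user, other))] += rating * items[item]
        self.sq_norm[user] += rating ** 2
        self.ratings[user][item] = rating

    def similarity(self, u, v):
        """Cosine similarity from the cached sums (0.0 if either user has no ratings)."""
        denom = math.sqrt(self.sq_norm[u] * self.sq_norm[v])
        return self.dot[frozenset((u, v))] / denom if denom else 0.0


sims = IncrementalCosine()
for user, item, rating in [("alice", "i1", 5), ("bob", "i1", 3),
                           ("alice", "i2", 1), ("bob", "i2", 4)]:
    sims.add_rating(user, item, rating)
print(sims.similarity("alice", "bob"))
```

    Each new rating costs time proportional to the number of users in this naive loop; an item-to-users index would restrict the work to the users who rated that item.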